Avoiding an XSLT Processor crash due to deep recursive processing

By  Dimitre Novatchev
 
Language  XSLT
Category designpatterns, namespace, xml, msxml
Posted 24  Mar  2001
Updated 25  Mar  2001
 
Summary
Here's a typical FAQ on XSLT processing
(http://sources.redhat.com/ml/xsl-list/2001-01/msg00303.html):
 
"Ok, I think the answer is "no", but I'm still curious to ask.
 
Say I have an XML doc that contains something like:
 
  <Tag ID="1">
    <Value>4</Value>
  </Tag>

  <Tag ID="2">
    <Value>2</Value>
  </Tag>
 
... and I want to end up with something like:
 
<table>
  <tr id="1">
    <td>
    </td>

    <td>
    </td>

    <td>
    </td>

    <td>
    </td>
  </tr>
</table>

<table>
  <tr id="2">
    <td>
    </td>

    <td>
    </td>
  </tr>
</table>
In other words, I want to create a set of new nodes, the count of which is based upon a *value* contained in the document.
 
I'm using MSXML 3.0, so I know I can extend functionality via the MSXML:SCRIPT tag (ok, so I haven't actually tried it yet, but I've found some close examples). But I was wondering if there was any way to do this via the standard functionality set."
 
And the typical answer which I and almost any other XSLT programmer give is (look also at the snippet of Martin Naughton:
 
(http://www.vbxml.com/snippetcentral/main.asp?view=viewsnippet&id=v20010126065631):
 
"Robert,
 
Call a (recursive) named template that has a parameter the number of nodes to be created. It will create just one node and in case more nodes need to be created -- will call itself with the parameter decreased by one."
 
Another, non-recursive solution exists, which I have been describing in the xsltalk forum.
 
Here's its general description as specified in the answer to the message above
(http://sources.redhat.com/ml/xsl-list/2001-01/msg00308.html):
 
"In case a maximum value limiting the possible number of nodes to be created is known in advance, then a non-recursive solution can be applied. It is similar to the solution of the string-padding problem, but uses a pre-defined tree containing the maximum nodes rather than a predefined string containing the maximum number of spaces."
 
At this time I didn't know that what I was describing was already known as the method of Wendell Piez.
 
Here's his own particular application of this method to Robert's problem:"
 
"Another way to do this is to fake a loop using the nefarious <xsl:for-each> ....
 
<xsl:template match="TAG">
  <TABLE>
    <TR ID="@ID">
      <xsl:for-each select="//*[position() &lt;= Value]">
        <TD>
        </TD>
      </xsl:for-each>
    </TR>
  </TABLE>
</xsl:template>
 
Of course, this is a truly horrible departure from the True Way and I'd have to report to the authorities anyone I discovered doing it. (Not only is it unconscionable heresy; it only works if the value of Value never gets higher than the total number of nodes in your source document.)"
 
Below I present a small generalisation which is independent of the number of nodes in the XML source document and uses the number of nodes in the stylesheet instead:
 
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes" omit-xml-declaration="yes" />

  <xsl:template match="Tag">
    <xsl:variable name="vValue" select="Value" />

    <TABLE>
      <TR ID="{@ID}">
        <xsl:for-each select="(document('')//*)[position() &lt;= $vValue]">
          <TD>
          </TD>
        </xsl:for-each>
      </TR>
    </TABLE>
  </xsl:template>
</xsl:stylesheet>
 
This uses the capacity of the stylesheet for element nodes only.
 
This capacity will be considerably increased if we test for more types of nodes like this:
 
<xsl:for-each select="($st//node()| $st//@* | $st//namespace::*)[position() &lt;= Value]">
where $st has been defined as document('') -- that is the root node of the stylesheet.
 
Question: What to do if we need to perform a very, very long loop and even the full supply of stylesheet nodes will not be sufficient?
 
Answer: Increase the number of nodes in the stylesheet. A lot of "nodes stuff" can be included in a stylesheet if necessary. This additional content will have nothing to do with the processing done by the stylesheet, except that its many nodes will count.
 
Any stuff can be included under a namespace-uri, different from the standard XSLT namespace-uri, like this:
 
<xsl:stylesheet version='1.0'
     xmlns:xsl='http://www.w3.org/1999/XSL/Transform'
     xmlns:Shakespeare="Shakespeare">

  <Shakespeare:Hamlet>
    <!-- Hamlet goes here -->
  </Shakespeare:Hamlet>

  <!-- the rest of the stylesheet -->
</xsl:stylesheet>
As it can be seen, I included Jon Bosak's xml version of "Hamlet" (http://www.stoa.org/bard/).
 
The following small template will return the number of nodes (the maximum recursion depth that can be avoided) we can now use to organise non-recursive loops:
 
<xsl:template match="/">
  <xsl:variable name="st" select="document('')"/>
  <xsl:value-of select=" count($st//node() | $st//@* | $st//namespace::*)"/>
</xsl:template>
And this number is: 32032. As some people noted, this solution could be called an ugly one. Or, to quote an xml fragment from our stylesheet,
 
<SPEECH>
  <SPEAKER>MARCELLUS</SPEAKER>
  <LINE>Something is rotten in the state of Denmark. </LINE>
</SPEECH>
However, it is much more ugly to watch the slow and painful agony of IIS crashing only because we used a beautiful, deep recursive processing.
 
When a future version of MSXML will implement (at least tail-) recursion as iteration, then we might stop needing good ole Shakespeare -- at least from within our stylesheets.
 
Until then -- what can be more aesthetically awarding than having some best pieces of literature in your stylesheet?
 
BTW, I also learned some new things from this. I thought it was Hamlet, who said that "rotten" phrase about Denmark.
 
Now, just a look at my stylesheet tells me it was actually MARCELLUS.
 
XML Source
 
XSL Stylesheet
 
 
XML/HTML Result:
 
 
Text Result: